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Short blunt-ended DNA duplexes comprising 6 to 20 base pairs self-assemble into polydisperse 
semi-flexible chains due to hydrophobic stacking interactions between terminal base pairs. Above 
a critical concentration, which depends on temperature and duplex length, such chains order into 
liquid crystal phases. Here, we investigate the self-assembly of such double-helical duplexes with a 
combined numerical and theoretical approach. We simulate the bulk system employing the coarse- 
grained DNA model recently proposed by Ouldridge et al. [ J. Chem. Phys. 134, 08501 (2011) ]. 
Then we evaluate the input quantities for the theoretical framework directly from the DNA model. 
The resulting parameter- free theoretical predictions provide an accurate description of the simulation 
results in the isotropic phase. In addition, the theoretical isotropic-nematic phase boundaries are in 
line with experimental findings, providing a route to estimate the stacking free energy. 



PACS numbers: 

I. INTRODUCTION 

Self-assembly is the spontaneous formation through 
free energy minimization of reversible aggregates of basic 
building blocks. The size of the aggregating units, e.g. 
simple molecules, macromolecules or colloidal particles, 
can vary from a few angstroms to microns, thus mak- 
ing self-assembly ubiquitous in nature and of interest in 
several fields, including material science, soft matter and 
biophysics [TH5]. Through self-assembly it is possible to 
design new materials whose physical properties are con- 
trolled by tuning the interactions of the individual build- 
ing blocks [BHS]. 

A relevant self-assembly process is the formation of 
filamentous aggregates (i.e. linear chains) induced by 
the anisotropy of attractive interactions. Examples are 
provided by micellar systems |10H12j , formation of fibers 
and fibrils [T3lfl6] , solutions of long duplex B-form DNA 
composed of 10^ to 10^ base pairs [T7H^ . filamentous 
viruses [HHH], chromonic liquid crystals [IB] as well as 
inorganic nanoparticles [27j . 

If linear aggregates possess sufficient rigidity, the sys- 
tem may exhibit liquid crystal (LC) phases (e.g. ne- 
matic or columnar) above a critical concentration. In 
the present study we focus on the self-assembly of short 
(i.e. 6 to 20 base pairs) DNA duplexes (DNADs) ^51 - 
[30 in which coaxial stacking interactions between the 
blunt ends of the DNADs favor their aggregation into 
weakly bonded chains. Such a reversible physical poly- 
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merization is enough to promote the mutual alignment of 
these chains and the formation of macroscopically orien- 
tationally ordered nematic LC phases. At present, stack- 
ing is understood in terms of hydrophobic forces acting 
between the fiat hydrocarbon surfaces provided by the 
paired nucleobases at the duplex terminals, although the 
debate on the physical origin of such interaction is still ac- 
tive and lively [3T1[32]- In this respect, the self-assembly 
of DNA duplexes provides a suitable way to access and 
quantify hydrophobic coaxial stacking interactions. 

In order to extract quantitative informations from 
DNA-DNA coaxial stacking experiments, reliable compu- 
tational models and theoretical frameworks are needed. 
Recent theoretical approaches have focused on the 
isotropic-nematic (LN) transition in self-assembling sys- 
tems [331 134) . building on previous work on rigid and 
semi-fiexible polymers [351 - 143] . In a recent publica- 
tion [33] we investigated the reversible physical poly- 
merization and collective ordering of DNA duplexes by 
modeling them as super-quadrics with quasi-cylindrical 
shape i45j with two reactive sites [46l [47] on their bases. 
Then we presented a theoretical framework, built on 
Wertheim [^SHSU] and Onsager f5T] theories, which is able 
to properly account for the association process. 

Here, we employ this theoretical framework to study 
the physical properties of a realistic coarse-grained model 
of DNA recently proposed by Ouldridge et al. [52] , where 
nucleotides are modeled as rigid bodies interacting with 
site-site potentials. Following Ref. [33]: we compute the 
inputs required by the theory, i.e. the stacking free en- 
ergy and the DNAD excluded volume, for the Ouldridge 
et al. model [55]. Subsequently we predict the poly- 
merization extent in the isotropic phase as well as the 
isotropic-nematic phase boundaries. 

To validate the theoretical predictions, we perform 
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large-scale molecular dynamics (MD) simulations in the 
NVT ensemble of a bulk system comprising 9600 nu- 
cleotides, a study made possible by the computational 
power of modern Graphical Processing Units (GPUs). 
The parameter-free theoretical predictions provide an 
accurate description of the simulation results in the 
isotropic phase, supporting the theoretical approach and 
its application in the comparison with experimental re- 
sults. 

The article is organized as follows. Section [Til provides 
details of the coarse-grained model of DNADs, of the 
MD computer simulations and presents a summary of 
the theory. Section |IV] describes the protocols imple- 
mented to evaluate the input parameters required by the 
theory via MC integrations for two DNADs. We also 
discuss some geometrical properties of the bonded dimer 
configurations. We then compare the theoretical predic- 
tions with simulation and experimental results. Finally, 
in Section |V] we discuss estimates for the stacking free 
energy and present our conclusions. 



II. METHODS 
A. Model 

We implement a coarse-grained model for DNA re- 
cently developed by Ouldridge et al. [511 IS3]- In such 
model, designed via a top-down approach, each nu- 
cleotide is described as a rigid body (see Figure [T]). The 
interaction forms and parameters are chosen so as to 
reproduce structural and thermodynamic properties of 
both single- (ssDNA) and double- (dsDNA) stranded 
molecules of DNA in B-form. All interactions between 
nucleotides are pairwise and, in the last version of the 
model [52] . continuous and differentiable. Such feature 
makes the model convenient foe MD simulations. 

The interactions between nucleotides account for ex- 
cluded volume, backbone connectivity, Watson-Crick hy- 
drogen bonding, stacking, cross-stacking and coaxial- 
stacking. The interaction parameters have been adjusted 
in order to be consistent with experimental data [52| [5¥j, 
[55] . In particular, the stacking interaction strength and 
stiffness have been chosen to be consistent with the ex- 
perimental results reported for 14-base oligomers by Hol- 
brook et al. |55| . Hydrogen-bonding and cross-stacking 
potentials were adjusted to give duplex and hairpin for- 
mation thermodynamics consistent with the SantaLucia 
parameterization of the nearest-neighbor model 54 . In- 
teraction stiffnesses were also further adjusted in order to 
provide correct mechanical properties of the model, such 
as the persistence length. The model does not have any 
sequence dependence apart from the Watson-Crick pair- 
ing, meaning that the strength of the interactions acting 
between nucleotides is to be considered as an average 
value. In addition, the model assumes conditions of high 
salt molarity (0.5 M). In this model, the coaxial-stacking 
interaction acts between any two non-bonded nucleotides 



and is responsible for the duplex-duplex bonding. It has 
been parametrized |56] to reproduce experimental data 
which quantify the stacking interactions by observing 
the difference in the relative mobility of a double strand 
where one of the two strands has been nicked with respect 
to intact DNA [57j[58] and by analyzing the melting tem- 
peratures of short duplexes adjacent to hairpins [5^ . 

To cope with the complexity of the model and the large 
number of nucleotides involved in bulk simulations, we 
employ a modified version of the CPU-CPU code used in 
a previous work [^, and extend it to support the force- 
fields [61]. Harvesting the power of modern Graphical 
Processing Units (GPUs) results in a 30-fold speed-up. 
The CPU version of the code, as well as the Python li- 
brary written to simplify generation of initial configura- 
tions and post-processing analysis, will soon be released 
as free software I62i. 



B. Bulk simulations 

To compare numerical results with theory, we perform 
Brownian dynamics simulations of 400 dsDNA molecules 
made up by 24 nucleotides each, i.e. 400 cylinder-like 
objects with an aspect ratio of ~ 2 (see Figure [T] (d)). 
The integration time step has been chosen to be 0.003 in 
simulation units which corresponds, if rescaled with the 
units of length, mass and energy used in the model, to 
approximately 1 x 10""'^* seconds. 

We study systems at three different temperatures, 
namely T = 270 K, 285 K and 300 K, and for different 
concentrations, ranging from 2 mg/ml to 241 mg/ml. The 
T — 270 K state point, despite being far from the ex- 
perimentally accessed T, is here investigated to test the 
theory in a region of the phase diagram where the degree 
of association is significant. To quantify the aggregation 
process we define two DNADs as bonded if their pair in- 
teraction energy is negative. Depending on temperature 
and concentration, we use 10^ — 10^ MD steps for equi- 
libration and 10^ — 10® MD steps for data generation on 
NVIDIA Tesla C2050 GPUs, equivalent to 1 - 10 ^s. 



III. THEORY 

We build on the theoretical framework previously de- 
veloped to account for the linear aggregation and col- 
lective ordering of quasi-cylindrical particles [Hj. Here, 
we provide a discussion of how such a theory can be 
used to describe the reversible chaining and ordering of 
oligomeric DNADs at the level of detail adopted by the 
present model. According to Ref. |;44j, the free energy of 
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FIG. 1: (a) Schematic representation of the coarse graining of the model for a single nucleotide, (b) Model interaction sites. 
For the sake of clarity, stacking and hydrogen-bonding sites are highlighted on one nucleotide and the base repulsion site on the 
other. For visualization reasons, in the following strands will be shown as ribbons and bases as extended plates as depicted in 
(c). (d) A 12 base-pairs DNAD in the minimum energy configuration, (e) An equilibrium configuration of the same object at 
T = 300 K. The nucleotides at the bottom are not bonded, the so-called fraying effect, (f) A chain of length A^i, — 48 extracted 
from a simulation at c = 200 mg/ml. 



a system of equilibrium polymers can be written as 
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where V is the volume of the system, = VdP {f — 
N/V is the number density of monomers) is the packing 
fraction, is the number density of chains of length /, 
normalized such that ^ ^(0 — Pi ''^d is the volume of 

a monomer, AFb, as discussed in the subsection |IV B 
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a parameter which depends on the free energy associated 
to a single bond and Vexci{lJ') is the excluded volume 
of two chains of length / and /'. ?7((/)) is the Parsons-Lee 
factor [63J 
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and (7^.(0^43) accounts for the orientational entropy that 
a chain of length I loses in the nematic phase (including 
possible contribution due to its flexibility). The explicit 
form for ao{l) can be found in Ref. 44J. 

The free energy functional (Eq. [T]) explicitly accounts 
for the polydispersity inherent in the equilibrium poly- 
merization using a discrete chain length distribution and 
for the entropic and energetic contributions of each single 
bond through the parameter AFb. 



1. Isotropic phase 

In the isotropic phase (Tq = and the excluded volume 
can be written as follows (see Appendix IaI): 
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where the parameters Bj and kj can be estimated via MC 
integrals of a system composed by only two monomers 
(see Appendix [A]) and Xq is the aspect ratio of the 
monomers. We assume that the chain length distribu- 
tion i>{l) is exponential |44j with an average chain length 
M 
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FIG. 2: Snapshots taken from simulations at T = 300 K. 
At low concentrations (c — 2mg/ml, top) chain formation is 
negligible and the average chain length is approximately 1. 
As the concentration is increased (c — 80mg/ml, bottom), 
DNADs start to self-assemble into chains and the average 
chain length increases. 



Minimization of the free energy in Eq. ^ with respect 
to M provides the following expression for the average 
chain length M{(j)): 



M = 
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2. Nematic phase 

In the nematic phase the monomer orientational dis- 
tribution function f{9) depends explicitly on the angle 9 
between the particle and the nematic axis, i.e. the di- 
rection of average orientation of the DNAD, since the 
system is supposed to have azimuthal symmetry around 



such axis. We assume the form proposed by Onsager |51) . 
i.e.: 
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where a controls the width of the angular distribution. 
Its equilibrium value is obtained by minimizing the free 
energy with respect to a. As discussed in Appendix [A} 
we assume the following form for the excluded volume in 
the nematic phase: 



we.d {I, l\Xo,a)^ 2BN{a)Xll I' + 2vdk%^ {a) 
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where the term 2vdkj^^ (a) is the end- midsection contri- 
bution to the excluded volume of two hard cylinders (see 
Appendix Ib|) and 
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In Eq. (10), D is the diameter of the monomer and rik 
with k = 1,2,3 are three parameters that we chose in 
order to reproduce the excluded volume calculated from 
MC calculations as discussed in Appendi x \K\ 

Inserting Eqs. ^ and Q into Eq. ([1| and assum- 
ing once more an exponential distribution for v[l) one 
obtains after some algebra: 
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where (Tq = X); '^o{l)v{l). The explicit calculation of the 
parameters and fc^*^ is explained in Appendices [a| 
andE 

Assuming that the orientational entropy do can be ap- 
proximated with the expression valid for long chains |43j , 
minimization with respect to M results in 

M = ^ (l + Vl + a0e'=«(")'^''('^)+/5^-P^'') . (12) 

while using the approximated expression for short 
chains FO, one obtains 



M 



[l + Vl + 4a(/)efc«(")'^''(<^')+^^-P''-i) . (13) 



The equilibrium value of a is thus determined by fur- 
ther minimizing the nematic free energy in Eq. (Ill, 



which has become only a function of a. The parame- 
ter a is related to the degree of orientational ordering 
in the nematic phase as expressed by the nematic order 
parameter S as follows: 
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S{a)^ J {3cos^e~l)fai0)nsmede^l-3/a. (14) 

Further refinements of tire tlreory could be obtained by 
including a more accurate description of the orientational 
distribution /„ (9) in the proximity of the I-N phase tran- 
sition, along the lines of Eqs. (40)-(42) of Ref. [H]. For 
the sake of simplicity we have just presented the basic 
theoretical treatment. However, in the theoretical cal- 
culations in Sec. IIVI we will make use of the refined and 
more accurate free energy proposed in Ref. [44] . 



3. Phase Coexistence 

The phase boundaries, at which the aggregates of 
DNAD are sufficiently long to induce macroscopic ori- 
entational ordering, are characterized by coexisting 
isotropic and nematic phases in which the volume frac- 
tion of DNADs are, respectively, (/jjv — VdPN and 0/ = 
VdPi- The number densities pi and pN can be calculated 
by minimizing Eq. ^ with respect to Mj and by mini- 
mizing Eq. (11) with respect to Mjv and a. In addition, 
the two phases must be at equal pressure, i.e. Pj = Pj^, 
and chemical potential, i.e. pi = /ijy. These conditions 
yield the following set of equations: 



where tun = 330 Da is the average mass of a nucleotide. 
Hence, in the following c/ and cm will be used in place 
of <j)i and (pM- 

First we calculate the dimensions (height L and width 
D) of the DNADs for different c and T. We observe no 
concentration dependence on both quantities, while the 
variation in T is negligible (of the order of 0.1% between 
DNADs of samples at 270 K and 300 K) . The effect of this 
small change does not affect substantially the value of 
the aspect ratio, which we consider constant {X^ = 2.06) 
throughout this work. 

The geometrical properties of end-to-end bonded du- 
plexes are not well-known since there are no experimental 
ways to probe such structures. In a very recent work, the 
interaction between duplex terminal base-pairs has been 
analyzed by means of large-scale full-atom simulations 
by Maffeo et al. [64] . They found that blunt-ended du- 
plexes (i.e. duplexes without dangling ends) have pref- 
erential binding conformations with different values of 
the aziniuthal angle 7, defined as the angle between the 
projections onto the plane orthogonal to the axis of the 
double helix of the vectors connecting the 05' and 03' 
terminal base pairs. They report two preferential values 
for 7, namely 7 = -20° and 7 = 180°. 
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IV. RESULTS AND DISCUSSION 
A. Properties of the model 

To characterize structural and geometrical properties 
of the simulated DNADs monomers and aggregates we 
analyze conformations of duplexes extracted from large- 
scale GPU simulations (see Figure[2]for some snapshots). 

In the following, the volume Vd occupied by a single 
DNAD of length XqD and double helix diameter D {D 
2 nm) will be calculated as the volume of a cylinder with 
the same length and diameter, i.e. Vd = ttXqD^ /A. When 
comparing numerical and experimental results with the- 
oretical predictions we use the number of nucleotides N^, 
in place of ATo (Ao ~ 0.172Af,) and the concentration c 
instead of the packing fraction 0, which can be related 
to the former via: 
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FIG. 3: Probability distributions for (a) the azimutlial angle 
7 and (b) the end-to-end distance r. 
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In the present model the continuity of the hehx under 
end-to-end interactions is intrinsic in the model and the 
azimuthal angle probability distribution is peaked around 
a single value 70 ~ 40° (see Figure ^a)). This IS very 
close to the theoretical value 7 « 36° given by the pitch 
of the B-DNA double helix. The qualitative difference 
between the conformations of bonded DNADs found in 
this work and in Ref. 64 should be addressed in future 
studies describing the coaxial end-to-end interaction in a 
more proper way. 

In addition, we calculate the average distance r be- 
tween the centres of masses of the terminal base pairs. 
Figure |3][b) shows P{r), the probability distribution of 
r. P{r) is peaked at 0.39 nm, whereas Maffeo et al. [M] 
found an average distance of r « 0.5 nm. This differ- 
ence can be understood in terms of the effect of the salt 
concentration which, being five times higher than the 
one used in Ref. [M], increases the electrostatic screen- 
ing, thus effectively lowering the repulsion between DNA 
strands. 

The effect of the temperature is small, as lowering T 
leads only to more peaked distributions for both ^(7) 
and P{r) (and a very small shift towards smaller angles 
for 7) but does not change the overall behavior. 



B. Stacking free energy and excluded volume 

In this section we discuss the procedure employed to 
evaluate the input quantities required by the theory, 
namely Ai^b and fexc/(',^')- To this aim we perform a 
Monte Carlo integration over the degrees of freedom of 
two duplexes. AF;, is defined as [44] 



/3AFh = In 



A(T) 



Vd 
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-/3v(ri2,ni,n2) 



1] dv 
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Here T12 is the vector joining the center of masses of 
particles 1 and 2, 0,i is the orientation of particle i and 
(. . .) represents an average taken over all the possible ori- 
entations. Vf, is the bonding volume, defined here as the 
set of points where the interaction energy V{yi2, Hi, $72) 
between duplex 1 and duplex 2 is less than ksT. To 
numerically evaluate A(T) we perform a MC integration 
using the following scheme: 



4. Insert a randomly oriented duplex i in a random 
position in a cubic box of volume V = 1000 mv? . 
Insert a second duplex j in a random position and 
with a random orientation. Compute the interac- 
tion energy V{i,i) between the two duplexes i and 
j and, if V{i,i) < ksT, update the energy factor, 
F = F + (e~^^(*J) - 1). Increment iVt,ios- 

5. Repeat from step 3, until A(r) = | F con- 
verges within a few per cent precision. 

The employed procedure to compute Vexci{l, V) is fairly 
similar except that it is performed for duplexes with a 
various number of bases (i.e. with different Xq) and the 
quantity F counts how many trials originate a pair con- 
figuration with V{i,j) > ksT (i.e. in step 4, F = F-l-1). 
In the nematic case, the orientations of the duplexes are 
extracted randomly from the Onsager distribution given 
by Eq. [5] With such procedure. 
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(19) 



We calculate Vexci for 8 values of a, ranging from 5 to 
45 (see Appendix |B]) . Since the Xq and I dependences of 
Eqs. ^ and ^ are the same and the Xq dependence of 
the numerically calculated Vexci on the shape of DNADs 
is negligible, the evaluation of the excluded volume as 
a function of Xq provides the same information as the 
evaluation of Vexci as a function of I. 

Fig. [4] shows A(T) for all investigated T in a In A vs 
1/T plot. A linear dependence properly describes the 
data at the three T. An alternative way to evaluate A(T) 
is provided by the limit p — > of Eq. ([7 1. Indeed in 
the low density limit M and A(T) are related via the 
following relation: 



A(r) 



Af (1 - M ) 



(20) 



Therefore it is also possible to estimate A(r) by ex- 
trapolating the low density data for M at T = 270 K, 
285 K and 300 K. The results, also shown in Fig. [3) are 
in line with the ones obtained through MC calculations. 
The Arrhenius behavior of A(T) suggests that bonding 
entropy and stacking energy are in first approximation 
T independent. The coaxial stacking free energy Gst is 
related to A(r) as follows 



1 . Produce an ensemble of 500 equilibrium configura- 
tions of a single duplex at temperature T. 

2. Set the counter A'trios — and the energy factor 
F = 0. 

3. Choose randomly two configurations i and j from 
the generated ensemble. 



GsT = -kBTH2pA{T)]. 



(21) 



Substituting the fit expression provided in Fig. |4] 
for A(r) results in a stacking free energy Gg^ = 
— 0.086 kcal/mol at a standard concentration 1 M of 
DNADs and T = 293 K comprising a bonding entropy of 
—30.6 cal/molK and a bonding energy of —9.06 kcal/mol. 
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FIG. 4: A(r) calculated with the procedures described in 
Sec. [rVBl for all investigated T. 

C. Isotropic phase: comparing simulation results 
with theoretical predictions 



FIG. 5: Average chain length M in the isotropic phase at 
low concentration. Symbols are numerical results and dashed 
lines are theoretical predictions. Dotted lines are theoreti- 
cal predictions using the excluded volume of HCs v^J^i (see 
Appendix [b| . 



Fig. [5] shows the concentration c dependence of M cal- 
culated from the MD simulation of the A^;, = 12 sys- 
tem. The average chain length increases progressively 
on increasing c. The figure also shows the theoretical 
predictions calculated by minimizing the isotropic free 
energy in Eq. ^ with respect to M using the previously 
discussed estimates for AFi, and Vexci- The theoretical 
results properly describe the MD simulation data up to 
concentrations around 200 mg/ml, which corresponds to 
a volume fraction (f> w 0.20. In Ref. 44, similar observa- 
tions have been made and the discrepancy at moderate 
and high (j> has been attributed to the inaccuracy of the 
Parsons decoupling approximation. The M values cal- 
culated using the excluded volume of two hard cylinders 
(HC) are also reported, to quantify the relevance of the 
actual shape of the DNA duplex. Indeed the HC pre- 
dictions appreciably deviate from numerical data beyond 
100 mg/ml. 

D. Phase Coexistence: Theoretical predictions 

A numerical evaluation of the phase coexistence be- 
tween the isotropic and the nematic phases for the coarse- 
grained model adopted in this study is still impossible to 
obtain given the current computational power. We thus 
limit ourselves to the evaluation of the I-N phase coexis- 
tence via the theoretical approach discussed in Sec. |III| 
Fig. |6] shows the theoretical phase diagram in the c-Ni, 
plane for T = 270 K and 300 K. As expected, both c/ 
and Cat decrease on increasing Ni,, since the increase of 
the number of basis result in a larger aspect ratio. On 
decreasing T, theory predicts a 10% decrease of c/ and a 



similar decrease of cn , resulting in an overall shift of the 
I-N coexistence region toward lower c values. This trend 
is related to the increase of the average chain length M 
with increasing PAF^ (see Fig. [4]). 

Fig. [6] also shows the phase boundaries calculated using 
the excluded volume of two hard cylinders. Assimilating 
DNADs to hard cylinders results in a 10-15% widening 
of the isotropic-nematic coexistence region. 

E. Comparison between theory and experiments 

The theoretical predictions concerning the isotropic- 
nematic coexisting concentrations can be compared to 
the experimental results reported in Refs. [55] and [30] 
for blunt-ended DNADs. 

Figure [7] compares the experimentally determined ne- 
matic concentrations cn at coexistence with the values 
calculated from the present model for T = 293 K. De- 
spite all the simplifying assumptions and despite the ex- 
perimental uncertainty, the results provide a reasonable 
description of the Ni, dependence of cat. The experimen- 
tal data refer to different base sequences and different 
salt concentrations. According to the authors cn is af- 
fected by an error of about ±50 mg/ml. In particular for 
the case Ni, ~ 12 the critical concentrations cat for dis- 
tinct sequences shows that blunt-end duplexes of equal 
length but different sequences can display significantly 
different transition concentrations. Hence, for each du- 
plex length, we consider the lowest transition concentra- 
tion among the ones experimentally determined, since 
this corresponds to the sequence closest to the symmet- 
ric monomer in the model. Indeed the dependence of 
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FIG. 6: I-N phase diagram in the c vs Nt plane for T = 270 K 
(top) and 300 K (bottom) . Dotted hnes are theoretical phase 
boundaries calculated using the excluded volume of HCs v^J^i 
(see Appendix [B]). 



on the DNADs sequence is expected to be larger for the 
shortest sequences, i.e. Nt < 12, for which DNAD bend- 
ing could be significant [3D]. Unfortunately, quantitative 
experimental data on this bending effect are still lack- 
ing. In general it is possible that for < 12 (for 
which a large number of sequences have been studied, 
see Fig. [?]), would be corrected to lower values if a larger 
number of sequences were explored. For more details on 
this phenomenon, we refer the reader to the discussions 
in Refs. [301 IHI^- 

The overestimation of the phase boundaries for Nf, > 
12 with respect to experimental results suggests that 
the DNA model of Ouldridge et al. [52] overestimates 
the coaxial stacking free energy. Such discrepancy 
can perhaps be attributed to the restricted number of 
microstates allowing for bonding states in the DNA 
model [521 llS]j as discussed in Sec. IV Indeed, allow- 



ing DNADs to form end-to-end bonds with more than 
one preferred azimuthal angle would increase the entropy 



of bonding, thus effectively lowering Gst- Allowing for 
both left- and right-handed binding conformations, a pos- 
sibility supported by the results of Maffco et al. [64j . 
would double A(T) in Eq. ( |2l] ) and hence add an entropic 
contribution equal to — fcsnn(2) to GsTi which would 
result in a value G^y — 0.403 kcal/mol = — 0.49 kcal/mol 
for T = 29iK. Fig. [7] also shows the theoretical predic- 
tion for such upgraded Gst value. 

In Fig. [7] theoretical calculations of the I-N transition 
lines are shown for Gst = —0.4 kcal/mol and Gst — 
—2.4 kcal/mol at T = 293 K as the upper and lower 
boundaries of the grey band respectively. To calculate 
these critical lines we retain the excluded volume cal- 
culated in the subsection |IVB| and, given the value of 
Gst, we evaluate Ai^j, according to Eqs. (17 1 and (21) 
for T = 293 K and p corresponding to the standard 1 M 
concentration. 

The selected points with A^f, > 12 fall within the 
grey band shown in Fig. [7] enabling us to provide an 
indirect estimate of Gst between —0.4 kcal/mol and 
—2.4 kcal/mol. For the points with TV < 12, where duplex 
bending might play a role, it would be valuable to have 
more experimental points corresponding to more straight 
sequences in order to validate the theoretical predictions. 

It is worth observing that for all DNAD lengths 
the electrostatics interactions are properly screened. For 
TVb = 20 a concentration 1.2 M of NaCl has been added 
to the solution resulting in a Debye screening length 
fc^^ « 0.23 nm. For all other lengths (i.e. Ni, < 18) we 
note that at the lowest DNA concentration of 440mg/ml 
corresponding to Nb — 14 fc^^ « 0.40 nm. Therefore the 
experimental kjj^ is always smaller than the excluded vol- 
ume diameter for the backbone-backbone interaction of 
our coarse-grained model [55] (« 0.6 nm), thus enabling 
us to neglect electrostatic interactions. 

On the other hand, if electrostatics interactions are 
not properly screened the effective aspect ratio for such 
DNAD sequences would be smaller than the ones used 
in our theoretical treatment and this would result in a 
underestimate of c^r. To account for this behavior one 
should at least have a reasonable estimate of the effec- 
tive size of DNADs when electrostatics interactions are 
not fully screened. Moreover, the role of electrostatics 
interactions can be subtle and not completely accounted 
for by simply introducing an effective size of DNADs. A 
possible route to include electrostatics in our treatment 
can be found in Ref. and it will be addressed in future 
studies. 



F. Comparison with Onsager Theory 

The experimental average aggregation numbers are es- 
timated in Refs. [T31 [2H| by mapping the self-assembled 
system onto an "equivalent" mono-disperse system of 
hard rods with an aspect ratio equal to MXq- In 
Ref. [33] it has been shown that the theoretically es- 
timated isotropic-nematic coexistence lines for the case 
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FIG. 7: Critical nematic concentrations cat as a function 
of the number of base pairs per duplex A^;, for the present 
model, calculated theoretically at T = 293 K using the com- 
puted stacking free energy Gg-r (short dashed lines), Gst = 
—0.49 kcal/mol (long dashed lines), and for experiments [28] 
(circles and squares). Squares are cat for different sequences 
at the same Nt, — 12. The grey band has been built consid- 
ering for Gst an upper bound of —0.4 kcal/mol and a lower 
bound of —2.4 kcal/mol. 



of polymerizing superquadric particles in the MXq — (p 
plane, parametrized by the stacking energy, are signifi- 
cantly different from the corresponding Onsager original 
predictions (as re-evaluated in Ref. [35 ). In light of the 
relevance for interpreting the experimental data, we show 
in Figure [8] the same curves for the DNA model investi- 
gated here. In this model, a clear re-entrant behavior of 
the transition lines in the c — MXq plane is observed. 
The re-entrant behavior occurs for values of the stack- 
ing free energy accessed at temperatures between 270 K 
and 330 K. We believe that the re-entrancy of the tran- 
sition lines in the c — MXq plane is a peculiar mark of 
the system polydispersity resulting from the reversible 
self-assembling into chains, and as such it should be also 
observable experimentally. 



V. CONCLUSIONS 

In this article, we have provided the first study of a 
bulk solution of blunt ended DNA duplexes undergoing 
reversible self-assembly into chains, promoted by stack- 
ing interactions. The simulation study, carried out at dif- 
ferent concentrations and temperatures, provides a clear 
characterization of the c and T dependence of the aver- 
age polymerization length M and an indirect estimate 
of the stacking free energy. We have provided a theo- 
retical description of the self-assembly process based on 
a theoretical framework recently developed in Ref. 




FIG. 8: Isotropic-nematic coexistence lines in the average 
aspect ratio MXq and concentration c plane for two values 
of iVi,, namely iVb = 12 (top) and iVb = 20 (bottom). Solid 
lines indicate theoretical predictions, dashed lines indicate the 
Onsager original predictions, as re-evaluated in Ref. [35] for 
c/ and Cat. Symbols along the isotropic and nematic phase 
boundaries at coexistence are joined by dotted lines, to indi- 
cate the change in concentration and average chain length at 
the transition. 



The inputs required by the theory (the DNAD excluded 
volumes and the stacking free energy) have been nu- 
merically calculated for the present DNA model, allow- 
ing a parameter free comparison between the molecular 
dynamic results and the theoretical predictions. Such 
comparison has been limited to the isotropic phase, due 
to the difficulties to simulate the dense nematic phase 
under equilibrium conditions. The description of the 
isotropic phase is satisfactory: quantitative agreement 
between theory and simulations is achieved for concen- 
trations up to c « 200 mg/ml. The stacking free en- 
ergy value that properly accounts for the polymerization 
process observed in the molecular dynamics simulations 
is G^grp = —0.086 kcal/mol at a standard concentration 
1 M of DNADs and T = 293 K comprising a bond- 
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ing entropy of — 30.6cal/mol K and a bonding energy of 
-9.06kcal/mol. 

Theoretical predictions for the I-N transition have been 
compared with experimental results for several DNA 
lengths, ranging from 8 to 20 bases. For Nj, > 12 the 
model predicts values for cn which are higher than ex- 
perimental ones. This suggests that the DNA model em- 
ployed overestimates Gst- In view of the recent results 
of Maffeo et al. (53] , we speculate that the bonding en- 
tropy is underestimated, in agreement with the observa- 
tion that the probability distribution of the azimuthal 
angle between two bonded DNADs, which is designed 
to be single-peaked, is too restraining. In this respect, 
the present study call for an improvement of the coarse- 
grained potential [S5] in regard to the coaxial stacking 
interaction. 

The value of Gst can also be used as a fitting pa- 
rameter in the theory for matching cjv with the experi- 
mental results, retaining the excluded volume estimates 
calculated for the coarse-grained DNA model. Such pro- 
cedure shows that values of the stacking free energy be- 
tween — 0.4kcal/mol and — 2.4kcal/mol are compatible 
with the experimental location of the I-N transition line. 
In the work of Maffeo et al., the authors report a quite 
smaller value of Gst, namely = — 6.3 kcal/mol, a 

value which was confirmed by the same authors by per- 
forming an investigation of the aggregation kinetic in a 
very lengthy all-atom simulation of DNAD with Nf, = 10. 



If such Gst value is selected as input in our theoretical 
approach (maintaining the same excluded volume term) , 
then one finds cjj ~ 250 mg/ml, a value significantly 
smaller than the experimental result (cat — 650 ± 50 
mg/ml). This casts some doubts on the effectiveness 
of the employed all-atom force-field to properly model 
coaxial stacking. 

Finally, our work draws attention to the errors af- 
fecting the estimate of the average chain length M 
via a straightforward comparison of the nematic co- 
existing concentrations with analytic predictions based 
on the original Onsager theory for mono-disperse thin 
rods [131 HH] ■ We have found that such approximation 
significantly underestimates M at the I-N transition con- 
centration Cn ■ In addition, the theoretical approach pre- 
dicts a re-entrant behavior of the transition lines in the 
c-MXq plane, a distinct feature of the polydisperse na- 
ture of the equilibrium chains. 
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Appendix A: Excluded volume contributions 

Here we further discuss the calculation of the excluded 
volume term Vexcii^i^') for the present model. Follow- 
ing Ref. [U], the excluded volume is assumed to be the 
following second order polynomial in I and I': 

Ve,ci[lJ'-J{e)] = 2 1 f{e)f{6')D^ [*i(7,^o)+ 
l + V 



and Eq. (Al) reduces to the form 



*2 (7, ^0)^0+*3 (7,^0)^0"'^ 



(Al) 



where the functions a = 1, 2, 3, describe the angular 
dependence of the excluded volume. The orientational 
probability f{6) is normalized such that 



J f{e)dn = i. 



(A2) 



The three contributions to the excluded volume 



in Eq. (All come from end-end, end-midsection and 
midsection-midsection steric interactions [H] between 
two chains. 

In the isotropic phase the orientational distribution 
does not have any angular dependence, i.e. f{6) = l/47r, 



Vexclihl' , Xq) — 



I + I' 

kiVd—-+Ai. (A3) 



The parameters Bj, kj and Aj appearing in Eq. (A3) 



can be calculated via MC integration procedures as dis- 
cussed in the subsection IV B and in Ref. [Hj . We expect 



that these parameters do not depend on Xq because each 
DNADs comprises Ni, stacked base pairs which are all 
identical with respect to excluded volume interactions 
(i.e. they all have the same shape). In particular, the 
calculated excluded volume of two DNADs is reported 
in Fig. |9] for 5 different aspect ratios, together with the 
resulting values for the above parameters. 
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FIG. 9: Excluded volume in the isotropic phase together with 
analytic approximations. From the linear fit one has Bi = 
0.959D^ and fcj = 3.084, while we assume Ai = 



Using the Onsager angular distribution faiO) in 
I, the excluded volume in the nematic phase de- 
, so on the parameter a, i.e. the general form in 



Eq. (A3 



pends a 



Eq. (All reduces to 



Vexci{lJ',Xo,a) = BNia)XQll' + kN{a)vd 



l + l' 



+ An (a). 



(A4) 



Assuming that A^ia) — 0, k^ict) = k^'~^{a) and 
Biq{a) is given by Eq. (10 1, the three parameters rjk 
with fc = 1,2,3 have to be estimated. For I = 1' = 1 and 
several values of a (a = 5 ... 45 in steps of 5) and Xq 
we calculated numerically the nematic excluded volume 
for two DNADs. The results are shown in Fig. [TOj where 
we plot Vexci/vd vs Xq for various a. The dashed lines 
shown in Fig. [lO] are obtained through a two-dimensional 
fit to numerical data for Vg^xcii^^^- XQ,a) using Eq. ([9]) 
as fitting function. 
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Appendix B: Excluded volume of hard cylinders 
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FIG. 10: Excluded volume as a function of aspect ratio Xo in 
the nematic phase together with analytic approximations for 
several a. The dashed lines are obtained plotting the function 
reported in Eq. ^ and setting 771 = 0.386419, 772 = 1.91328 
and T]z = -0.836354. 

For two rigid chains of length I and V which are com- 
posed of hard cylinders (HCs) of diameter D and length 
{I, I') can be described by 
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where cos 7 — u • u', u and u' are the orientations of two 
HCs and i?(sin7) is the complete elliptical integral 
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The integrals in Eq. (Bl I can be calculated exactly in 



the isotropic phase, while in the nematic phase the calcu- 
lation can be done analytically only for suitable choices 
of the angular distribution f{0). Here we assume that 
the angular distribution is given by the Onsager function 
in Eq. 

Using the Onsager orientational function the follow- 
ing approximate expressions for the coefficients /cat (a), 
Bjv(a) and Ajv(q;) can be derived f43| 
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We evaluate numerically the excluded volume in Eq. 



(Bl) for many values of a and, building on the expres- 



sions in Eqs. (B3), we perform a fit to this data using 



the following functions: 



D 



C4 
-,9/2 



C5 



,11/2 



= 4 1- 



-E 

1=2 

(7r/4)2 (p,(a) 



C4 
-v9/2 



1 V — 

C5 



,11/2 



(B5) 
(B6) 
(B7) 



The coefficient values resulting from the fitting proce- 
dure are C4 = 1.2563, C5 = -0.95535, do = 3.0846, di = 
-4.0872, d2 = 9.0137, = -9.009 and d^ = 3.3461. 



